Credit Risk Modelling Platform

TL;DR

End-to-end credit risk system on 30,000 records. Champion/challenger architecture: WOE scorecard vs XGBoost — both tracked via MLflow. Monte Carlo simulation (500k scenarios) for VaR & CVaR. Three-scenario stress testing (Base / Mild / Severe). Fully deployed FastAPI backend + interactive HTML dashboard. IFRS 9 and Basel III methodologies applied.

30,000 Records | AUC Tracked via MLflow | 500k Monte Carlo Scenarios | Live on Render | IFRS 9 / Basel III

Python, FastAPI, Machine Learning, XGBoost, Logistic Regression, scorecardpy, Monte Carlo, MLflow, DVC, HTML/CSS/JS

Project Overview

The Credit Risk Modelling Platform is an end-to-end credit risk analytics system built on the UCI Taiwan Credit Card Default dataset (30,000 records). Designed to meet IFRS 9 Expected Credit Loss (ECL) and Basel III stress-testing requirements, it implements a champion/challenger model architecture — a WOE-based Logistic Regression scorecard as the champion and an XGBoost classifier as the challenger — enabling real-time comparison of interpretable and ensemble-based risk signals.

The platform exposes a FastAPI backend with a business-friendly input layer (10 plain-language fields abstracting 21 raw features) and a fully interactive HTML/CSS/JavaScript frontend covering prediction, portfolio analytics, Monte Carlo simulation, stress testing, and sensitivity analysis. Experiment tracking is handled by MLflow and the training pipeline is managed via DVC with a params.yaml configuration file.

Problem Statement

Financial institutions face significant uncertainty in credit decision-making. Without a structured, data-driven approach, lenders rely on subjective judgment — leading to inconsistent approvals, under-provisioned capital buffers, and regulatory non-compliance. Key challenges include:

No consistent methodology for evaluating borrower default risk across a portfolio.
Inability to quantify portfolio-level exposure — lenders cannot see total expected losses or how losses concentrate in tail scenarios.
Poor stress resilience visibility — banks cannot answer "what happens to our losses if the economy contracts by 30%?" without manual, one-off analyses.
Regulatory pressure — IFRS 9 mandates ECL reporting; Basel III requires stress-tested capital adequacy — both are difficult to produce without an automated system.
Model opacity — black-box models cannot explain a credit decision to a borrower or regulator, creating legal and reputational risk.

This platform addresses all five problems: it produces consistent, explainable credit scores; quantifies ECL across the full portfolio for IFRS 9 reporting; stress-tests losses under adverse scenarios in line with Basel III; and maintains a challenger model for ongoing performance benchmarking, all through a single unified system.

Key Insights

Champion/Challenger architecture allows continuous model benchmarking — the scorecard provides regulatory-friendly explainability while XGBoost maximises predictive accuracy.
WOE transformation enforces monotonic risk relationships and produces an interpretable credit score in the 576–906 range, with score bands mapping directly to lending decisions.
Business input layer abstracts 21 raw dataset columns into 10 plain-language fields — non-technical users never interact with raw model features.
Monte Carlo simulation (up to 500,000 scenarios) quantifies tail risk metrics — Value at Risk (VaR) and Conditional VaR (CVaR) — that deterministic ECL alone cannot capture.
Three-scenario stress testing (Base, Mild +30% PD, Severe +80% PD) lets risk managers evaluate capital adequacy before adverse conditions materialise.
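Sensitivity analysis shows ECL moving about 2.2× as much under the LGD sweep as under the PD sweep. Because ECL is linear in both inputs and the sweep applies absolute shifts to LGD but relative shifts to PD, the two are not measured on the same scale; the result still indicates that recovery-strategy improvements can move ECL as much as, or more than, tighter credit selection.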
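
The stress and sensitivity insights can be reproduced in miniature. The following is a minimal sketch, assuming ECL = PD × LGD × EAD for a single exposure. The borrower values (PD 0.20, LGD 0.45, EAD 10,000) and all function names are illustrative assumptions, not values from the platform; only the +30% and +80% PD multipliers come from the scenarios listed above.

```python
# Illustrative single-exposure example; all borrower numbers are assumptions.
PD_MULTIPLIERS = {"base": 1.0, "mild": 1.3, "severe": 1.8}  # +0%, +30%, +80% PD


def ecl(pd_, lgd, ead):
    """Expected credit loss for one exposure."""
    return pd_ * lgd * ead


def stressed_ecl(pd_, lgd, ead, scenario):
    """ECL with PD scaled by the scenario multiplier, capped at 1."""
    stressed_pd = min(pd_ * PD_MULTIPLIERS[scenario], 1.0)
    return ecl(stressed_pd, lgd, ead)


def pct_change(new, base):
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


PD, LGD, EAD = 0.20, 0.45, 10_000.0  # hypothetical borrower
base = ecl(PD, LGD, EAD)

stress = {name: stressed_ecl(PD, LGD, EAD, name) for name in PD_MULTIPLIERS}

# Sensitivity: a relative +10% PD shift versus an absolute +0.10 LGD shift.
pd_effect = pct_change(ecl(PD * 1.10, LGD, EAD), base)
lgd_effect = pct_change(ecl(PD, LGD + 0.10, EAD), base)

for name, value in stress.items():
    print(f"{name:>6}: ECL = {value:,.0f}")
print(f"PD +10% (relative):   {pd_effect:+.1f}% ECL")
print(f"LGD +0.10 (absolute): {lgd_effect:+.1f}% ECL")
```

Under these assumed inputs the absolute LGD shift moves ECL about 2.2 times as much as the relative PD shift, simply because 0.10 is roughly 22% of a 0.45 base LGD. Whether the platform's reported ratio arises the same way depends on its actual base LGD, which this write-up does not state.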

Technical Implementation

Model Architecture:
- Scorecard: raw data → feature engineering (15+ derived features) → WOE binning via scorecardpy → Logistic Regression → additive points table → credit score.
- Challenger: same engineered features → sklearn Pipeline (OrdinalEncoder + XGBClassifier with scale_pos_weight=3.52 for class imbalance) → probability output.
- Feature selection removed collinear, legally sensitive, and negative-coefficient columns; manual WOE bin breaks enforced monotonicity for the UTILIZATION feature.
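
The WOE and score-scaling steps in the scorecard path can be sketched in plain Python. This is a minimal illustration and not the scorecardpy implementation: the bin counts are made up, the sign convention (WOE as the log of bad share over good share) is an assumption, and the scaling parameters `base_score`, `base_odds`, and `pdo` are hypothetical rather than the platform's settings.

```python
import math


def woe_table(bins):
    """Compute WOE and IV contribution per bin.

    bins: list of (label, n_good, n_bad) tuples.
    Convention assumed here: WOE = ln(share of bads / share of goods),
    so a higher WOE means higher risk. Some texts use the inverse.
    """
    total_good = sum(good for _, good, _ in bins)
    total_bad = sum(bad for _, _, bad in bins)
    rows = []
    for label, good, bad in bins:
        good_share = good / total_good
        bad_share = bad / total_bad
        woe = math.log(bad_share / good_share)
        rows.append((label, woe, (bad_share - good_share) * woe))
    return rows


def pd_to_score(pd_, base_score=600.0, base_odds=1 / 19, pdo=50.0):
    """Map a default probability to a score.

    The score equals base_score at bad:good odds of base_odds and
    falls by pdo points each time those odds double.
    """
    factor = pdo / math.log(2)
    offset = base_score + factor * math.log(base_odds)
    return offset - factor * math.log(pd_ / (1.0 - pd_))


# Hypothetical counts for three UTILIZATION bins (not taken from the dataset).
utilization_bins = [("<30%", 6000, 600), ("30-70%", 3000, 900), (">70%", 1000, 500)]
rows = woe_table(utilization_bins)
woes = [woe for _, woe, _ in rows]
information_value = sum(iv for _, _, iv in rows)

# Monotonicity check: risk should rise steadily with utilization.
is_monotonic = all(a < b for a, b in zip(woes, woes[1:]))

for label, woe, iv in rows:
    print(f"{label:>7}  WOE={woe:+.3f}  IV part={iv:.4f}")
print("monotonic:", is_monotonic, " IV:", round(information_value, 4))
```

A check like `is_monotonic` is the kind of test that would flag when automatically generated bin boundaries need manual breaks.
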
Training Pipeline (mlops/):
- Reproducible end-to-end pipeline: load → clean → engineer → select → WOE bin → two-pass LR (first pass surfaces negative coefficients) → final LR + scorecard table → XGBoost with 5-fold CV.
- Hyperparameters versioned in params.yaml; all runs logged to MLflow including AUC, KS statistic, Gini coefficient, confusion matrix, and ROC curve artifacts.
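
For reference, the three discrimination metrics named above are closely related and can all be derived from one ROC sweep. The sketch below uses NumPy only; the function name `discrimination_metrics` and the toy labels are assumptions for illustration. Values like these are what one would pass to `mlflow.log_metric` in a training run.

```python
import numpy as np


def discrimination_metrics(y_true, y_score):
    """Return AUC, Gini, and KS for binary labels (1 = default) and scores.

    Higher scores are assumed to mean higher default risk.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)

    # Sort by descending score and accumulate hit rates.
    order = np.argsort(-y_score, kind="mergesort")
    labels = y_true[order]
    scores = y_score[order]
    tpr = np.cumsum(labels == 1) / np.sum(y_true == 1)
    fpr = np.cumsum(labels == 0) / np.sum(y_true == 0)

    # Keep one ROC point per distinct score so ties are handled correctly.
    last_of_run = np.r_[scores[1:] != scores[:-1], True]
    tpr = np.r_[0.0, tpr[last_of_run]]
    fpr = np.r_[0.0, fpr[last_of_run]]

    # Trapezoidal area under the ROC curve.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    gini = 2.0 * auc - 1.0
    ks = float(np.max(np.abs(tpr - fpr)))
    return {"auc": auc, "gini": gini, "ks": ks}


# Toy example with four borrowers.
metrics = discrimination_metrics([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(metrics)
```
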
API Layer (FastAPI):
- Six route groups: /predict, /predict/business, /ecl, /simulate, /stress-test, /sensitivity.
- Pydantic schemas enforce input validation; CORS middleware allows the standalone frontend to call the API without a proxy.
- Business input routes accept 10 plain-language fields and internally call input_mapper.py to reconstruct all 21 raw dataset columns before inference.
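
The idea behind the business input layer can be sketched as follows. This is not the contents of `input_mapper.py`: the business field names are hypothetical, there are fewer than ten of them, a standard-library dataclass stands in for the Pydantic schema, and the choice of 21 raw columns (the UCI column names without SEX and MARRIAGE) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BusinessInput:
    """Hypothetical plain-language fields; the real schema is not shown here."""
    credit_limit: float
    age: int
    education_level: int        # encoded like the dataset's EDUCATION column
    months_late: int            # typical repayment delay, 0 if paid on time
    avg_monthly_bill: float
    avg_monthly_payment: float

    def __post_init__(self):
        # Basic validation, standing in for Pydantic field constraints.
        if self.credit_limit <= 0:
            raise ValueError("credit_limit must be positive")
        if not 18 <= self.age <= 100:
            raise ValueError("age must be between 18 and 100")
        if self.avg_monthly_bill < 0 or self.avg_monthly_payment < 0:
            raise ValueError("amounts must be non-negative")


def to_raw_features(b: BusinessInput) -> dict:
    """Expand business fields into raw dataset-style columns."""
    raw = {"LIMIT_BAL": b.credit_limit, "AGE": b.age, "EDUCATION": b.education_level}
    # The UCI dataset names its repayment-status columns PAY_0, PAY_2 ... PAY_6.
    for col in ("PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"):
        raw[col] = b.months_late
    # Repeat the monthly averages across the six historical months.
    for month in range(1, 7):
        raw[f"BILL_AMT{month}"] = b.avg_monthly_bill
        raw[f"PAY_AMT{month}"] = b.avg_monthly_payment
    return raw


example = BusinessInput(
    credit_limit=50_000, age=35, education_level=2,
    months_late=1, avg_monthly_bill=12_000, avg_monthly_payment=3_000,
)
raw_features = to_raw_features(example)
print(len(raw_features), "raw columns")
```

Keeping the expansion in one function means the frontend and any external integration only ever see the plain-language fields.
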
Risk Analytics Services:
- ECL service — computes PD × LGD × EAD per borrower with optional segment-level breakdown; returns individual and portfolio totals.
- Monte Carlo service — vectorised NumPy simulation producing Expected Loss, Unexpected Loss, VaR, CVaR, min/max, and a 200-point loss distribution sample for charting.
- Stress testing service — applies PD multipliers and LGD overrides from risk_config.py; optionally overlays Monte Carlo on each scenario.
- Sensitivity service — sweeps relative PD shifts and absolute LGD shifts, returning ECL change percentage and elasticity at each point.
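
The ECL and Monte Carlo calculations can be illustrated with a short vectorised NumPy sketch. It rests on simplifying assumptions that may differ from the platform's services: defaults are drawn independently per borrower, LGD and EAD are fixed, and unexpected loss is taken as the standard deviation of simulated losses. The portfolio itself is synthetic and the function names are hypothetical.

```python
import numpy as np


def expected_credit_loss(pd_, lgd, ead):
    """Per-borrower ECL = PD x LGD x EAD (element-wise)."""
    return pd_ * lgd * ead


def simulate_losses(pd_, lgd, ead, n_scenarios=10_000, confidence=0.99, seed=42):
    """Simulate portfolio losses with independent Bernoulli defaults."""
    rng = np.random.default_rng(seed)
    # One row per scenario, one column per borrower; True marks a default.
    defaults = rng.random((n_scenarios, pd_.size)) < pd_
    # Loss per scenario = sum of LGD x EAD over defaulted borrowers.
    losses = (defaults * (lgd * ead)).sum(axis=1)

    var = float(np.quantile(losses, confidence))
    cvar = float(losses[losses >= var].mean())  # mean loss at or beyond VaR
    return {
        "expected_loss": float(losses.mean()),
        "unexpected_loss": float(losses.std()),
        "var": var,
        "cvar": cvar,
        "min": float(losses.min()),
        "max": float(losses.max()),
    }


# Synthetic 200-borrower portfolio (illustrative values only).
pd_ = np.linspace(0.05, 0.40, 200)
lgd = np.full(200, 0.45)
ead = np.linspace(1_000.0, 5_000.0, 200)

total_ecl = float(expected_credit_loss(pd_, lgd, ead).sum())
result = simulate_losses(pd_, lgd, ead)

print(f"Deterministic ECL: {total_ecl:,.0f}")
for key, value in result.items():
    print(f"{key:>16}: {value:,.0f}")
```

The simulated expected loss should sit close to the deterministic ECL, while VaR and CVaR describe the tail that the single ECL number does not show.
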
Frontend Dashboard (HTML/CSS/JS):
- Six pages: Dashboard, Prediction, Risk Analytics, Simulation, Stress Test, Sensitivity — all sharing a unified dark-themed design system.
- Stress test results rendered via Chart.js bar chart; simulation results display VaR / CVaR metrics in a responsive grid layout.

Key Learnings

Scorecard development took two LR passes: the first surfaces WOE features whose coefficients come out negative, meaning their fitted sign contradicts the risk direction encoded in their WOE values. Removing them before the second pass is common scorecard practice, not optional cleanup.
WOE binning is sensitive to auto-generated boundaries; manual breaks are sometimes necessary to enforce the risk ordering regulators expect.
Separating the business input layer from raw model features is architecturally essential — it decouples frontend UX from model internals and makes the API safe for non-technical integrations.
Monte Carlo simulation reveals tail risk that deterministic ECL masks completely — two portfolios with identical ECL can have very different VaR profiles depending on PD distribution shape.
MLflow experiment tracking becomes indispensable the moment you run more than a handful of training experiments; reproducing a specific run without it is extremely difficult.
Regulatory frameworks (IFRS 9, Basel III) are not abstract — building to their requirements from the start (ECL methodology, stress scenario definitions, model documentation) is far cheaper than retrofitting compliance later.

Future Work

Add a model card with fairness metrics and feature importance for regulatory documentation.
Integrate a live ingestion pipeline (for example Kafka for streaming, with Airflow for orchestration) so the platform ingests live transaction data rather than batch uploads.
Implement model drift detection — PSI (Population Stability Index) monitoring on score distributions over time.
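
As a starting point for the drift-detection item, PSI compares the share of scores falling into each bin between a baseline sample and a recent sample. The sketch below is one common formulation, assuming decile bins taken from the baseline; the function name, the synthetic score samples, and the small `eps` floor are illustrative choices, not part of the current platform.

```python
import numpy as np


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI = sum((actual% - expected%) * ln(actual% / expected%)) over bins."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Interior bin edges from baseline quantiles (deciles by default).
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1)[1:-1])

    # Assign each score to a bin index in 0 .. n_bins - 1 and count.
    exp_counts = np.bincount(np.searchsorted(edges, expected, side="right"), minlength=n_bins)
    act_counts = np.bincount(np.searchsorted(edges, actual, side="right"), minlength=n_bins)

    # Floor the shares so empty bins do not produce log(0).
    exp_share = np.clip(exp_counts / expected.size, eps, None)
    act_share = np.clip(act_counts / actual.size, eps, None)
    return float(np.sum((act_share - exp_share) * np.log(act_share / exp_share)))


# Synthetic score samples (not platform data).
rng = np.random.default_rng(0)
baseline = rng.normal(700, 50, 5_000)
stable = rng.normal(700, 50, 5_000)    # same distribution as baseline
shifted = rng.normal(670, 50, 5_000)   # scores drifted 30 points lower

psi_stable = population_stability_index(baseline, stable)
psi_shifted = population_stability_index(baseline, shifted)
print(f"PSI stable:  {psi_stable:.4f}")
print(f"PSI shifted: {psi_shifted:.4f}")
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though the thresholds used in practice are a policy choice.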

Built by Om Patel — ML Engineer & Data Scientist.